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Abstract 

Dynamic  task  allocation  is  an  essential  requirement  for  multi-robot 
systems  operating  in  unknown  dynamic  environments.  It  allows  robots  to 
change  their  behavior  in  response  to  environmental  changes  or  actions  of 
other  robots  in  order  to  improve  overall  system  performance.  Emergent 
coordination  algorithms  for  task  allocation  that  use  only  local  sensing  and 
no  direct  communication  between  robots  are  attractive  because  they  are 
robust  and  scalable.  However,  a  lack  of  formal  analysis  tools  makes  emer¬ 
gent  coordination  algorithms  difficult  to  design.  In  this  paper  we  present 
a  mathematical  model  of  a  general  dynamic  task  allocation  mechanism. 
Robots  using  this  mechanism  have  to  choose  between  two  types  of  task, 
and  the  goal  is  to  achieve  a  desired  task  division  in  the  absence  of  ex¬ 
plicit  communication  and  global  knowledge.  Robots  estimate  the  state  of 
the  environment  from  repeated  local  observations  and  decide  which  task 
to  choose  based  on  these  observations.  We  model  the  robots  and  obser¬ 
vations  as  stochastic  processes  and  study  the  dynamics  of  the  collective 
behavior.  Specifically,  we  analyze  the  effect  that  the  number  of  obser¬ 
vations  and  the  choice  of  the  decision  function  have  on  the  performance 
of  the  system.  The  mathematical  models  are  validated  in  a  multi-robot 
multi-foraging  scenario.  The  model’s  predictions  agree  very  closely  with 
experimental  results  from  sensor-based  simulations. 


1  Introduction 

In  the  1980’s  it  was  considered  ground-breaking  for  a  mobile  robot  to  move 
around  an  unstructured  environment  at  reasonable  speeds.  In  the  years  since, 
advancements  in  both  hardware  mechanisms  and  software  architectures  and 
algorithms  have  resulted  in  quite  capable  mobile  robot  systems.  Provided  with 
this  baseline  competency  of  individual  robots,  increasing  attention  has  been  paid 
to  the  study  of  Multi-Robot  Systems  (MRS),  and  in  particular  distributed  MRS 
with  which  the  remainder  of  this  paper  is  concerned.  In  a  distributed  MRS  there 
is  no  centralized  control  mechanism  -  instead,  each  robot  operates  independently 
under  local  sensing  and  control,  with  coordinated  system-level  behavior  arising 
from  local  interactions  among  the  robots  and  between  the  robots  and  the  task 
environment.  The  effective  design  of  coordinated  MRS  is  restricted  by  the  lack 
of  formal  design  tools  and  methodologies.  The  design  of  single  robot  systems 
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(SRS)  has  greatly  benefited  from  the  formalisms  provided  by  control  theory  - 
the  design  of  MRS  is  in  need  of  analogous  formalisms. 

For  a  group  of  robots  to  effectively  perform  a  given  system-level  task,  the 
designer  must  address  the  question  of  which  robot  should  do  which  task  and 
when  [4].  The  process  of  assigning  individual  robots  to  sub-tasks  of  a  given 
system-level  task  is  called  task  allocation,  and  it  is  a  key  functionality  required 
of  any  MRS.  Dynamic  task  allocation  is  a  class  of  task  allocation  in  which 
the  assignment  of  robots  to  sub-tasks  is  a  dynamic  process  and  may  need  to 
be  continuously  adjusted  in  response  to  changes  in  the  task  environment  or 
group  performance.  The  problem  of  task  allocation  in  a  distributed  MRS  is 
further  compounded  by  the  fact  that  task  allocation  must  occur  as  a  result 
of  a  distributed  process  as  there  is  no  central  coordinator  available  to  make 
task  assignments.  This  increases  the  problem’s  complexity  because,  due  to  the 
local  sensing  of  each  robot,  no  robot  has  a  complete  view  of  the  world  state. 
Given  this  incomplete  and  often  noisy  information,  each  robot  must  make  local 
control  decisions  about  which  actions  to  perform  and  when,  without  complete 
knowledge  of  what  other  robots  have  done  in  the  past,  are  doing  now,  or  will 
do  in  the  future. 

There  are  a  number  of  task  allocation  models  and  philosophies.  Historically, 
the  most  popular  approaches  rely  on  intentional  coordination  to  achieve  task 
allocation  [21].  In  those,  the  robots  coordinate  their  respective  actions  explicitly 
through  deliberate  communications  and  negotiations.  Due  to  scaling  issues,  such 
approaches  are  primarily  used  in  MRS  consisting  of  a  relatively  small  number 
of  robots  (i.e.,  fewer  than  10).  Task  allocation  through  intentional  coordination 
remains  the  preferred  approach  because  it  is  better  understood,  easier  to  design 
and  implement,  and  more  amenable  to  formal  analysis  [4]. 

As  the  size  of  the  MRS  grows,  the  complexity  of  the  design  of  intentional  ap¬ 
proaches  increases  due  to  increased  demands  in  communication  bandwidth  and 
computational  abilities  of  individual  robots.  Furthermore,  complexity  intro¬ 
duced  by  increased  robot  interactions  makes  such  systems  much  more  difficult 
to  analyze  and  design.  This  leads  to  the  alternative  to  intentional  coordination, 
namely,  task  allocation  through  utilizing  emergent  coordination.  In  systems 
using  emergent  coordination,  individual  robots  coordinate  their  actions  based 
solely  on  local  sensing  information  and  local  interactions.  Typically,  there  is 
very  little  or  no  direct  communication  or  explicit  negotiations  between  robots. 
They  are,  therefore,  more  scalable  to  larger  numbers  of  robots  and  are  more 
able  to  take  advantage  of  the  robustness  and  parallelism  provided  by  the  aggre¬ 
gation  of  large  numbers  of  coordinated  robots.  The  drawback  of  task  allocation 
as  achieved  through  emergent  coordination  mechanisms  is  that  such  systems 
can  be  difficult  to  design,  solutions  are  commonly  sub-optimal,  and  since  co¬ 
ordination  is  achieved  through  many  simultaneous  local  interactions  between 
various  subsets  of  robots,  predictive  analysis  of  expected  system  performance  is 
difficult. 

As  MRS  composed  of  ever-larger  numbers  of  robots  become  available,  the 
need  for  task  allocation  through  emergent  coordination  will  increase.  To  address 
the  lack  of  formalisms  in  the  design  of  such  MRS,  in  this  article  we  present 


and  experimentally  verify  a  predictive  mathematical  model  of  dynamic  task 
allocation  for  MRS  using  emergent  coordination.  Such  a  formal  model  of  task 
allocation  is  a  positive  step  in  the  direction  of  placing  the  design  of  MRS  on  a 
formal  footing. 

The  rest  of  the  paper  is  organized  as  follows.  Section  2  provides  a  summary 
of  related  work.  In  Section  3  we  describe  a  general  mechanism  for  task  alloca¬ 
tion  in  dynamic  environments.  This  is  a  distributed  mechanism  based  on  local 
sensing.  In  Section  4  we  present  a  mathematical  model  of  the  collective  behav¬ 
ior  of  an  MRS  using  this  mechanism  and  study  its  performance  under  a  variety 
of  conditions.  We  validate  the  model  in  a  multi-foraging  domain.  In  Section  5 
we  define  the  experimental  task  domain  of  multi-foraging,  robot  controllers  and 
the  simulation  environment.  In  Section  6  we  compare  the  predictions  of  math¬ 
ematical  models  with  the  results  from  sensor-based  simulations.  We  conclude 
the  paper  in  Section  7,  with  a  discussion  of  the  approach  and  the  results. 

2  Related  Work 

Mathematical  modeling  and  analysis  of  the  collective  behavior  of  MRS  is  a 
relatively  new  field  with  approaches  and  methodologies  borrowed  from  other 
fields,  including  mathematics,  physics,  and  biology.  Recently,  a  number  of  re¬ 
searchers  attempted  to  mathematically  analyze  multi-robot  systems  by  using 
phenomenological  models  of  the  type  present  here.  Sugawara  et  al.  [23,  24] 
developed  a  simple  model  of  cooperative  foraging  in  groups  of  communicating 
and  non-communicating  robots.  Kazadi  et  al.  [11]  studied  the  general  prop¬ 
erties  of  multi-robot  aggregation  using  phenomenological  macroscopic  models. 
Agassounon  and  Martinoli  [1]  presented  a  model  of  aggregation  in  which  the 
number  of  robots  taking  part  in  the  clustering  task  is  based  on  the  division  of 
labor  mechanism  in  ants.  These  models  are  ad-hoc  and  domain  specific,  and  the 
authors  give  no  explanation  as  to  how  to  apply  such  models  to  other  domain.  In 
earlier  works  we  have  developed  a  general  framework  for  creating  phenomeno¬ 
logical  models  of  collective  behavior  in  groups  of  robots  [16,  18].  We  applied  this 
framework  to  study  collaborative  stick-pulling  in  a  group  of  reactive  robots  [17] 
and  foraging  in  robots  [13]. 

Most  of  the  approaches  listed  above  are  implicitly  or  explicitly  based  on 
stochastic  processes  theory.  Another  example  of  the  stochastic  approach  is  the 
probabilistic  microscopic  model  developed  by  Martinoli  and  coworkers  [19,  20,  8] 
to  study  collective  behavior  of  a  group  of  robots.  Rather  than  compute  the 
exact  trajectories  and  sensory  information  of  individual  robots,  Martinoli  et  al. 
model  each  robot’s  interactions  with  other  robots  and  the  environment  as  a 
series  of  stochastic  events,  with  probabilities  determined  by  simple  geometric 
considerations.  Running  several  series  of  stochastic  events  in  parallel,  one  for 
each  robot,  allowed  them  to  study  the  group  behavior  of  the  multi-robot  system. 

So  far  very  little  work  has  been  done  on  mathematical  analysis  of  multi-robot 
systems  in  dynamic  environments.  We  have  recently  extended  [14]  the  stochas¬ 
tic  processes  framework  developed  in  earlier  work  to  robots  that  change  their 


behavior  based  on  history  of  local  observations  of  the  (possibly  changing)  envi¬ 
ronment  [15].  In  the  current  paper  we  develop  these  ideas  further,  and  present 
the  exact  stochastic  model  of  the  system,  in  addition  to  the  phenomenological 
model. 

Closest  to  ours  is  the  work  of  Huberman  and  Hogg  [7] ,  who  mathematically 
studied  collective  behavior  of  a  system  of  adaptive  agents  using  game  dynamics 
as  a  mechanism  for  adaptation.  In  game  dynamical  systems,  winning  strategies 
are  rewarded,  and  agents  use  the  best  performing  strategies  to  decide  their  next 
move.  Although  their  adaptation  mechanism  is  different  from  our  dynamic 
task  allocation  mechanism,  their  analytic  approach  is  similar  to  ours,  in  that 
it  is  based  on  the  theory  of  stochastic  processes.  Others  have  mathematically 
studied  collective  behavior  of  systems  composed  of  large  numbers  of  concurrent 
learners  [25,  22].  These  are  microscopic  models,  which  only  allow  one  to  study 
collective  behavior  of  relatively  small  systems  of  a  few  robots.  We  are  interested 
in  macroscopic  approaches  that  enable  us  to  directly  study  collective  behavior 
in  large  systems.  Our  work  differs  from  earlier  ones  in  another  important  way: 
we  systematically  compare  theoretical  predictions  of  mathematical  models  with 
results  of  experiments  carried  out  in  a  sensor-based  simulator. 


3  Dynamic  Task  Allocation  Mechanism 

The  dynamic  task  allocation  scenario  we  study  considers  a  world  populated  with 
tasks  of  T  different  types  and  robots  that  are  equally  capable  of  performing  each 
task  but  can  only  be  assigned  to  one  type  at  any  given  time.  For  example,  the 
tasks  could  be  targets  of  different  priority  that  have  to  be  tracked,  different 
types  of  explosives  that  need  to  be  located,  etc.  Additionally,  a  robot  cannot  be 
idle  —  each  robot  is  always  performing  a  task  at  any  given  time.  We  introduce 
the  notion  of  a  robot  state  as  a  shorthand  for  the  type  of  task  the  robot  is 
assigned  to  service.  A  robot  may  switch  its  state  according  to  its  control  policy 
when  it  determines  it  is  appropriate  to  do  so.  However,  needlessly  switching 
tasks  is  to  be  avoided,  since  in  physical  robot  systems,  this  can  involve  complex 
physical  movement  that  requires  time  to  perform. 

The  purpose  of  task  allocation  is  to  assign  robots  to  tasks  in  a  way  that 
will  enhance  the  performance  of  the  system,  which  typically  means  reducing 
the  overall  execution  time.  Thus,  if  all  tasks  take  an  equal  amount  of  time  to 
complete,  in  the  best  allocation,  the  fraction  of  robots  in  state  i  will  be  equal 
to  the  fraction  of  tasks  of  type  i.  In  general,  however,  the  desired  allocation 
could  take  other  forms  —  for  example,  it  could  be  related  to  the  relative  reward 
or  cost  of  completing  each  task  type  —  without  change  to  our  approach.  In 
the  dynamic  task  allocation  scenario,  the  number  of  tasks  and  the  number  of 
available  robots  are  allowed  to  change  over  time,  for  example,  by  adding  new 
tasks,  deploying  new  robots,  or  removing  malfunctioning  robots. 

The  challenge  faced  by  the  designer  is  to  devise  a  mechanism  that  will  lead  to 
a  desired  task  allocation  in  a  distributed  MRS  even  as  the  environment  changes. 
The  challenge  is  made  even  more  difficult  by  the  fact  that  robots  have  limited 


sensing  capabilities,  do  not  directly  communicate  with  other  robots,  and  there¬ 
fore,  cannot  acquire  global  information  about  the  state  of  the  world,  the  initial 
or  current  number  of  tasks  (total  or  by  type),  or  the  initial  or  current  number 
of  robots  (total  or  by  assigned  type).  Instead,  robots  can  sample  the  world 
(assumed  to  be  finite)  —  for  example,  by  moving  around  and  making  local  ob¬ 
servations  of  the  environment.  We  assume  that  robots  are  able  to  observe  tasks 
and  discriminate  their  types.  They  may  also  be  able  to  observe  and  discriminate 
the  task  states  of  other  robots. 

One  way  to  give  the  robot  an  ability  to  respond  to  environmental  changes 
(including  actions  of  other  robots)  is  to  give  a  robot  an  internal  state  where  it  can 
store  its  knowledge  of  the  environment  as  captured  by  its  observations  [9,  14]. 
The  observations  are  stored  in  a  rolling  history  window  of  finite  length,  with  new 
observations  replacing  the  oldest  ones.  The  robot  consults  these  observations 
periodically  and  updates  its  task  state  according  to  some  transition  function 
specified  by  the  designer.  In  an  earlier  work  we  showed  [9,  15]  that  this  simple 
dynamic  task  allocation  mechanism  leads  to  the  desired  task  allocation  in  a 
multi-foraging  scenario. 

In  the  following  sections  we  present  a  mathematical  model  of  dynamic  task 
allocation  and  study  the  role  that  transition  function  and  the  number  of  obser¬ 
vations  (history  length)  play  in  the  performance  of  a  multi-foraging  MRS.  In 
Section  4.1,  we  present  a  model  of  a  simple  scenario  in  which  robots  base  their 
decisions  to  change  state  solely  on  observations  of  tasks  in  the  environment.  We 
study  the  simplest  form  of  the  transition  function,  in  which  the  probability  to 
change  state  to  some  type  is  proportional  to  the  fraction  of  existing  tasks  of 
that  type.  In  Section  6.1  we  compare  theoretical  predictions  with  no  adjustable 
parameters  to  experimental  data  and  find  excellent  agreement.  In  Section  4.2 
we  examine  the  more  complex  scenario  where  the  robots  base  their  decisions 
to  change  task  state  on  the  observations  of  both  existing  task  types  and  task 
states  of  other  robots.  In  Section  6.2  we  study  the  consequences  of  the  choice 
of  the  transition  function  and  history  length  on  the  system  behavior  and  find 
good  agreement  with  the  experimental  data. 


4  Analysis  of  Dynamic  Task  Allocation 

As  proposed  in  the  previous  section,  a  robot  may  be  able  to  adapt  to  a  changing 
environment  in  the  absence  of  complete  global  knowledge  if  it  is  able  to  make 
and  remember  local  observations  of  the  environment.  In  the  treatment  below 
we  assume  that  there  are  two  types  of  tasks  —  arbitrarily  referred  to  as  Red 
and  Green.  This  simplification  is  for  pedagogical  reason  only;  the  model  can 
be  extended  to  a  greater  number  of  task  types. 

During  a  sufficiently  short  time  interval,  each  robot  can  be  considered  to 
belong  to  the  Green  or  Red  task  state.  This  is  a  very  high  level,  coarse-grained 
description.  In  reality,  each  state  is  composed  of  several  robot  actions  and 
behaviors,  for  example,  searching  for  new  tasks,  detecting  and  executing  them, 
avoiding  obstacles,  etc.  However,  since  we  want  the  model  to  capture  how  the 


fraction  of  robots  in  each  task  state  evolves  in  time,  it  is  a  sufficient  level  of 
abstraction  to  consider  only  these  two  states.  If  we  find  that  additional  levels 
of  detail  are  required  to  explain  system  behavior,  we  can  elaborate  the  model 
by  breaking  each  of  the  high  level  states  into  its  underlying  components. 

4.1  Observations  of  Tasks  Only 

In  this  section  we  study  dynamic  task  allocation  mechanism  in  which  robots 
make  decisions  to  switch  task  states  based  solely  on  observations  of  available 
tasks.  Let  mr  and  mg  be  the  numbers  of  the  observed  Red  and  Green  tasks, 
respectively,  in  a  robot’s  memory  or  history  window.  The  robot  chooses  to 
change  its  state,  or  the  type  of  task  it  is  assigned  to  execute,  with  probabilities 
given  by  transition  functions  /g_>r(mr,  ms)  (probability  of  switching  to  Red 
from  Green)  and  fr^g(mr,mg)  (probability  of  switching  to  Green  from  Red). 
We  would  like  to  define  transition  rules  so  that  the  fraction  of  time  the  robot 
spends  in  the  Red  (Green)  state  be  equal  to  the  fraction  of  Red  (Green)  tasks. 
This  will  assure  that  on  average  the  number  of  Red  and  Green  robots  reflect 
the  desired  task  distribution.  Clearly,  if  the  robots  have  global  knowledge  about 
the  numbers  of  Red  and  Green  tasks  Mr  and  Mg,  then  each  robot  could  choose 
each  state  with  probability  equal  to  the  fraction  of  the  tasks  of  corresponding 
type.  Such  global  knowledge  is  not  available;  hence,  we  want  to  investigate 
how  incomplete  knowledge  of  the  environment  (through  local  observations),  as 
well  as  the  dynamically  changing  environment  (e.g.,  changing  ratio  of  Red  and 
Green  tasks),  affects  task  allocation. 

4.1.1  Modelling  Robot  Observations 

As  explained  above,  the  transition  rate  between  task  execution  states  depends 
on  robot’s  observations  stored  in  its  history.  In  our  model  we  assume  that  a 
robot  makes  an  observation  of  a  task  with  a  time  period  r.  For  simplicity,  by 
an  observation  we  mean  here  detecting  a  task,  such  as  a  target  to  be  monitored, 
mine  to  be  cleared  or  an  object  to  be  gathered.  Therefore,  observation  history  of 
length  h  comprises  of  the  number  of  Red  and  Green  tasks  a  robot  has  observed 
during  a  time  interval  hr.  We  assume  that  r  has  unit  length  and  drop  it. 
The  process  of  observing  a  task  is  given  by  a  Poisson  distribution  with  rate 
A  =  aM°,  where  cc  is  a  constant  characterizing  the  physical  parameters  of  the 
robot  such  as  its  speed,  view  angles,  etc.,  and  M°  is  the  number  of  tasks  in  the 
environment.  This  simplification  is  based  on  the  idea  that  robot’s  interactions 
with  other  robots  and  the  environment  are  independent  of  the  robot’s  actual 
trajectory  and  are  governed  by  probabilities  determined  by  simple  geometric 
considerations.  This  simplification  has  been  shown  to  produce  remarkably  good 
agreements  with  experiments  [20,  8]. 

Let  Mr(t)  and  Mg(t)  be  the  number  of  Red  and  Green  tasks  respectively 
(can  be  time  dependent),  and  let  M(t)  =  Mr(t)  +  Mg(t)  be  the  total  number  of 
tasks.  The  probability  that  in  the  time  interval  [t  —  h,  t]  the  robot  has  observed 


exactly  mr  and  mg  tasks  is  the  product  of  two  Poisson  distributions: 


P(mr,mg) 


\  mr  \my 
Ar  Ag 

mr\mQ\ 


—Xr—Xa 


(i) 


where  Aj  ,  i  =  r,g,  are  the  means  of  the  respective  distributions.  If  the  task 
distribution  does  not  change  in  time,  A i  =  aM,h.  For  time  dependent  task 
distributions,  A =  a J  hdt'Mi(t'). 


4.1.2  Individual  Dynamics:  The  Stochastic  Master  Equation 

Let  us  consider  a  single  robot  that  has  to  decide  between  executing  Red  and 
Green  tasks  in  a  closed  arena  and  makes  a  transition  to  Red  and  Green  states 
according  to  its  observations.  Let  pr{t)  be  the  probability  that  a  robot  is  in  the 
Red  state  at  time  t.  The  equation  governing  its  evolution  is 


dpr 

dt 


£•(1  Pr)fg—>r  &Prfr—>g 


(2) 


where  e  is  the  rate  at  which  the  robot  makes  decisions  to  switch  its  state, 

and  f g and  fr^g  are  the  corresponding  transitions  probabilities  between  the 

states.  As  explained  above,  these  probabilities  depend  on  the  robot’s  history  — 
the  number  of  tasks  of  either  type  it  has  observed  during  the  time  interval  h  pre¬ 
ceding  the  transition.  If  the  robots  have  global  knowledge  about  the  numbers 
of  Red  and  Green  tasks  Mr  and  Mg,  one  could  choose  the  transition  proba¬ 
bilities  as  the  fraction  of  tasks  of  corresponding  type,  fg^r  oc  Mr/(Mr  +  Mg) 

and  f r _ ,g  oc  Mg/(Mr  +  Mg).  In  the  case  when  the  global  information  is  not 

available,  it  is  natural  to  use  similar  transition  probabilities  using  robots’  local 
estimates: 
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Note  that  7 r(mr,  mg)  +  7 g(mr,  mg)  =  1  whenever  mr  +  mg  >  0,  e.g .,  whenever 
there  is  at  least  one  observation  in  the  history  window.  In  the  case  when  there 
are  no  observations  in  history,  mr  =  rng  =  0,  robots  will  choose  either  state  with 
probability  1/2  as  it  follows  from  taking  the  appropriate  limits  in  Equations  3 
and  4.  Hence,  we  supplement  Equation  3  with  /g^r(0,0)  =  /r^g(0,0)  =  0  (and 
similarly  for  Equation  4)  to  assure  that  robots  do  not  change  their  state  when 
the  history  window  does  not  contain  any  observations. 

Equation  2,  together  with  the  transition  rates  shown  in  Equations  3-4,  deter¬ 
mines  the  evolution  of  the  probability  density  of  a  robot’s  state.  It  is  a  stochastic 
equation  since  the  coefficients  (transition  rates)  depend  on  random  variables  mr 
and  mg.  Moreover,  since  the  robot’s  history  changes  gradually,  the  values  of  the 
coefficients  at  different  times  are  correlated,  hence  making  the  exact  treatment 
very  difficult.  We  propose,  instead,  to  study  the  problem  within  the  annealed 


approximation:  we  neglect  time-correlation  between  robot’s  histories  at  differ¬ 
ent  times,  assuming  instead  that  at  any  time  the  real  history  {mr,mg}  can 
be  replaced  by  a  random  one  drawn  from  the  Poisson  distribution  Equation  1. 
Next,  we  average  Equation  2  over  all  histories  to  obtain 


dpr  -  n  s 
—  =  £7r(l  -  nr)  -  £7 gnr 


(5) 


Here  7,,  and  jg  are  given  by 


7r  =Y,P^9)yir^g 

r,g  ^ 


Y  p(r’  3) 

r,g 
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where  P(mr,mg)  is  the  Poisson  distribution  Equation  1  and  the  summation 
excludes  the  term  r  =  g  =  0.  Note  that  if  the  distribution  of  tasks  changes  in 
time,  then  7  are  time-dependent,  7  =  7 ,%g(t). 

To  proceed  further,  we  need  to  evaluate  the  summations  in  Equation  6.  Let 
us  define  an  auxiliary  function 


F(x ) 
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It  is  easy  to  check  that  7  are  given  by 

%  =  F(l)-ip(0,0)  =  F(l)-^e' 

7g  =  l-P(l)~e^M° 
Differentiating  Equation  7  with  respect  to  x  yields 


dF 

dx 
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Note  that  the  summation  over  mr  starts  from  mr  =  1.  Clearly,  the  sums  over  mr 
and  mg  are  de-coupled  thanks  to  the  cancellation  of  the  denominator  ( mr+mg ): 


dF 

dx 


(xAg)^\ 
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The  resulting  sums  are  evaluated  easily  (as  the  Taylor  expansion  of  correspond¬ 
ing  exponential  functions),  and  the  results  is 


_  —  \  p-Ao(l-x) 

dx 


(11) 


where  Ao  =  Xr  +  Xg. 
the  condition  P(0)  = 

Xr,  Aq: 


After  elementary  integration  of  Equation  11  (subject  to 
1/2),  we  obtain  using  Equation  9  and  the  expressions  for 


7  r,«W  = 
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Here  pr,g(t )  =  Mr  g(t) /M0  are  the  fraction  of  Red  and  Green  tasks  respectively. 

Let  us  first  consider  the  case  when  the  task  distribution  does  not  change 
with  time,  i.e.,  pr(t)  =  po-  Then  we  have 

%,g(t)  =  (l-e-ahM°)£tg  (13) 

The  solution  of  Equation  5  subject  to  the  initial  condition  pr(t  =  0)  =  po  is 
readily  obtained: 

Pr(t)  =  p°r  +  (p0  -  e-^+J9)t  (14) 

V  7r+7s/ 

One  can  see  that  the  probability  distribution  approaches  the  desired  steady 
state  value  p sr  =  p®  exponentially.  Also,  the  coefficient  of  the  exponent  depends 
on  the  density  of  tasks  and  the  length  of  the  history  window.  Indeed,  it  is 
easy  to  check  that  7r  +  7S  =  1  —  Hence,  for  large  enough  Mq  and 

h,  ahMo  1,  the  convergence  rate  is  determined  solely  by  e.  For  a  small 
task  density  or  short  history  length,  on  the  other  hand,  the  convergence  rate  is 
proportional  to  the  number  of  tasks,  e(l  —  e~ahM° )  ~  eahM0.  Note  that  this  is 
a  direct  consequence  of  the  rule  that  robots  do  not  change  their  state  whenever 
there  are  no  observation  in  the  history  window. 

Now  let  us  consider  the  case  where  the  task  distribution  changes  suddenly 
at  time  to,  pr{t)  =  p®  +  A p9{t  —  to),  where  9(t  —  to)  is  the  step  function. 
For  simplicity,  let  us  assume  that  ahMo  ^  1  so  that  the  exponential  term  in 
Equation  12  can  be  neglected, 

1  [f 

7r,g(i)  =  h  Jf  f  dt'hr,g(t),  Jr(t)  +  7ff  =  1  (15) 

Replacing  Equation  15  into  Equation  5,  and  solving  the  resulting  differential 
equation  yields 

Pr(t)  =  pV+^t-^i  l^e~st),  t<h 

pr(t)  =  p°r  +  Ap-^(e-£d-V-e-et),  t>h.  (16) 

eh 

Equation  16  describes  how  the  robot  distribution  converges  to  the  new  steady 
state  value  after  the  change  in  task  distribution.  Clearly,  the  convergence  prop¬ 
erties  of  the  solutions  depend  on  h  and  e.  It  is  easy  to  see  that  in  the  limiting 
case  e/i>l  the  new  steady  state  is  attained  after  time  h,  \pr(h)  —  (po  +  A p)  \  ~ 
A p/(eh)  <C  1,  so  the  convergence  time  is  tconv  ~  h.  In  the  other  limiting  case 
eh  <C  1,  on  the  other  hand,  the  situation  is  different.  A  simple  analysis  of 
Equation  16  for  t  >  h  yields  \pr(t)  —  (p0  +  Ay/.)  |  ~  A pe~et  so  the  convergence 
is  exponential  with  characteristic  time  tconv  ~  1/e. 

4.1.3  Collective  Behavior 

In  order  to  make  predictions  about  the  behavior  of  an  MRS  using  a  dynamic 
task  allocation  mechanism,  we  need  to  develop  a  mathematical  model  of  the 


collective  behavior  of  the  system.  In  the  previous  section  we  derived  a  model  of 
how  an  individual  robot’s  behavior  changes  in  time.  In  this  section  we  extend 
it  to  model  the  behavior  of  a  MRS.  In  particular,  we  study  the  collective  behav¬ 
ior  of  a  homogenous  system  consisting  of  N  robots  with  identical  controllers. 
Mathematically,  the  MRS  is  described  by  a  probability  density  function  that 
includes  the  states  of  all  N  robots.  However,  in  most  cases  we  are  interested 
in  studying  the  evolution  of  global,  or  average,  quantities,  such  as  the  average 
number  of  robots  in  the  Red  state,  rather  than  the  exact  probability  density 
function.  This  applies  when  comparing  theoretical  predictions  with  results  of 
experiments,  which  are  usually  quoted  as  an  average  over  many  experiments. 
Since  the  robots  in  either  state  are  independent  of  each  other,  pr(t ),  is  now 
the  fraction  of  robots  in  the  Red  state,  and  consequently  Npr(t)  is  the  average 
number  of  robots  in  that  state.  The  results  of  the  previous  section,  namely 
solutions  for  pr(t)  for  constant  task  distribution  (Equation  14)  and  for  chang¬ 
ing  task  distribution  (Equation  16),  can  be  used  to  study  the  average  collective 
behavior.  Section  6.1  presents  results  of  analysis  of  the  mathematical  model. 

4.1.4  Stochastic  Effects 

In  some  cases  it  is  useful  to  know  the  probability  distribution  of  robot  task 
states  over  the  entire  MRS.  This  probability  function  describes  the  exact  col¬ 
lective  behavior  from  which  one  could  derive  the  average  behavior  as  well  as 
the  fluctuations  around  the  average.  Knowing  the  strength  of  fluctuations  is 
necessary  for  assessing  how  the  probabilistic  nature  of  robot’s  observations  and 
actions  affects  the  global  properties  of  the  system.  Below  we  consider  the  prob¬ 
lem  of  finding  the  probability  distribution  of  the  collective  state  of  the  system. 

Let  Pn(t)  be  the  probability  that  there  are  exactly  n  robots  in  the  Red  state 
at  time  t.  For  a  sufficiently  short  time  interval  At  we  can  write  [15] 

Pn(t  +  At)  =  53  Wn'nft  At)P„,(t)  -  ^  Wnn>(t]  A t)Pn(t)  (17) 

n'  n' 

where  Wl:j(t:;  At)  is  the  transition  probability  between  the  states  i  and  j  during 
the  time  interval  (t,t  +  At).  In  our  MRS,  this  transitions  correspond  to  robots 
changing  their  state  from  Red  to  Green  or  vice  versa.  Since  the  probability 
that  more  than  one  robot  will  have  a  transition  during  a  time  interval  At  is 
O(At),  then,  in  the  limit  At  — >  0  only  transitions  between  neighboring  states 
are  allowed  in  Equation  17,  n  — >  n  ±  1.  Hence,  we  obtain 

dP 

=  rn+1Pn+i(t)  +  gn-iPn-i(t)  -  (r„  +  gn)Pn(t) .  (18) 

Here  rk  is  the  probability  density  of  having  one  of  the  k  Red  robots  change  its 
state  to  Green,  and  g k  is  the  probability  density  of  having  one  of  the  N  —  k 
Green  robots  change  its  state  to  Red.  Let  us  assume  again  that  ahM0  ^  1  so 
that  7fl  =  1  —  7r.  Then  one  has 


rk  =  k(  1  -  7r)  ,  gk  =  (N  -  k) 7, 


(19) 


with  ro  =  ff-i  =  0,  Tn+i  =  9n  =  0.  7r  is  history-averaged  transition  rate  to 
Red  states. 

The  steady  state  solution  of  Equation  18  is  given  by  [10] 
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where  Pg  is  determined  by  the  normalization: 
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Using  the  expression  for  7,  after  some  algebra  we  obtain 
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(21) 


(22) 


e.g.,  the  steady  state  is  a  binomial  distribution  with  parameter  7.  Note  again 
that  this  is  a  direct  consequence  of  the  independence  of  the  robots’  dynamics. 
Indeed,  since  the  robots  act  independently,  in  the  steady  state  each  robot  has 
the  same  probability  of  being  in  either  state.  Moreover,  using  this  argument  it 
becomes  clear  that  the  time-dependent  probability  distribution  Pn(t )  is  given 
by  Equation  22  with  7  replaced  by  pr(t),  Equation  14. 


4.2  Observations  of  Tasks  and  Robots 

In  this  section  we  study  the  more  complex  dynamic  task  allocation  mechanism 
in  which  robots  make  decisions  to  change  their  state  based  on  the  observations 
of  not  only  available  tasks  but  also  on  the  observed  task  states  of  other  robots. 
Specifically,  each  robot  now  records  the  numbers  and  types  of  task  as  well  as 
the  numbers  and  task  types  of  robots  it  has  encountered.  Again,  we  let  mr 
and  mg  be  the  number  of  tasks  of  Red  and  Green  type,  and  nr  and  ng  be  the 
number  of  robots  in  Red  and  Green  task  state  in  a  robot’s  history  window. 
The  probabilities  for  changing  a  robot’s  state  are  again  given  by  transition 
functions  that  now  depend  on  the  fractions  of  observed  tasks  and  robots  of 
each  type:  mr  =  mr/{mr  +  mg),  rhg  =  mg/(mr  +  mg),  nr  =  nr/{nr  +  ng), 
and  ng  =  ng/(nr  +  ng).  In  our  previous  work  [15]  we  showed  that  in  order  to 
achieve  the  desired  long  term  behavior  for  task  allocation  (i.  e.,  in  the  steady 
state  the  average  fraction  of  Red  and  Green  robots  is  equal  to  the  fraction  of 
Red  and  Green  tasks  respectively),  the  transition  rates  must  have  the  following 
functional  form: 

fg^r(mr,hr)  =  rhrg(mr  -  hr),  (23) 

fr — >g (fhr ,  hr)  =  ihgg(m.g  -  hg)  =  (1  -  ihr)g{-rhr  +  hr).  (24) 

Here  g(z)  is  a  continuous,  monotonically  increasing  function  of  its  argument 
defined  on  an  interval  [—1,1].  In  this  paper  we  consider  the  following  forms  for 


•  Power:  g(z)  =  1002/100 

•  Stepwise  linear:  g{z)  =  zQ(z).1 

To  analyze  this  task  allocation  model,  let  us  again  consider  a  single  robot 
that  searches  for  tasks  to  perform  and  makes  a  transition  to  Red  and  Green 
states  according  to  transition  functions  defined  above.  Let  pr(t)  be  the  proba¬ 
bility  that  the  robot  is  in  the  Red  state  at  time  t,  with  Equation  2  governing 
its  time  evolution.  Note  that  pr(t)  is  also  the  average  fraction  of  Red  robots, 
Pr(t)  =  Nr{t)/N. 

As  in  the  previous  case,  the  next  step  of  the  analysis  is  averaging  over  the 
the  robot’s  histories,  i.e.,  mr  and  hr.  Note  that  a  robot’s  observations  of  avail¬ 
able  tasks  can  still  be  modeled  by  a  Poisson  distribution  similar  to  Equation  1. 
However,  since  the  number  of  robots  of  each  task  state  changes  stochastically 
in  time,  the  statistics  of  nr  and  ng  should  be  modeled  as  a  doubly  stochas¬ 
tic  Poisson  process  (also  called  Cox  process)  with  stochastic  rates.  This  would 
complicate  the  calculation  of  the  average  over  hr  =  nr/(nr  +  ng)  and  require 
mathematical  details  that  go  well  beyond  the  scope  of  this  paper.  Fortunately, 
as  we  demonstrated  in  the  previous  section,  if  a  robot’s  observation  window 
contains  many  readings,  then  the  estimated  fraction  of  task  types  is  exponen¬ 
tially  close  to  the  average  of  the  Poisson  distribution.  This  suggests  that  for 
sufficiently  high  densities  of  tasks  and  robots  we  can  neglect  the  stochastic  ef¬ 
fects  of  modeling  observations  for  the  purpose  of  our  analysis,  and  replace  the 
robot’s  observation  by  their  average  (expected)  values.  In  other  words,  we  use 


the  following  approximation: 

i  /■*  , 

flr  £ 

s  -  /  pr(t)dt 
n  Jt-h 

(25) 

1  /'*  , 

mr  £ 

s  -  /  pr(t  )dt  . 
n  Jt-h 

(26) 

The  Equations  2,  25,  and  26  are  a  system  of  integro-differential  equations 
that  uniquely  determine  the  dynamics  of  pr(t).  In  the  most  general  case  it  is 
not  possible  to  obtain  solutions  by  analytical  means,  hence  one  has  to  solve  the 
system  numerically.  However,  if  the  task  density  does  not  change  in  time,  we 
can  still  perform  steady  state  analysis.  Steady  state  analysis  looks  for  long-term 
solutions  that  do  not  change  in  time,  i.e.,  dpr/dt  =  0.  Let  g®  be  the  density 
of  Red  tasks,  and  po  =  pr(t  — »  oo)  be  the  steady  state  value,  so  that  mr  =  g®, 
hr  =  Pr-  Then,  by  setting  left  hand  side  of  Equation  2  to  zero,  we  get 

(1  -  Po)p°rg(Pr  -  Po)  =  Po(l  -  hr)g(-hr  +  P0)  (27) 

Note  that  po  =  g°r  is  a  solution  to  Equation  27  so  that  in  the  steady  state 
the  fraction  of  Red  robots  equals  the  fraction  of  red  tasks  as  desired.  To  show 
that  this  is  the  only  solution,  we  note  that  for  a  fixed  g®  the  right-  and  left-hand 

1The  step  function  0  is  defined  as  (~)(z)  =  1  if  z  >  0;  otherwise,  it  is  0.  The  step  function 
guarantees  that  no  transitions  to  Red  state  occur  when  mr  <  nr. 


sides  of  the  equation  are  monotonically  increasing  and  decreasing  functions  of 
Po  respectively,  due  to  the  monotonicity  of  g(z).  Consequently,  the  two  curves 
can  meet  only  once  and  that  proves  the  uniqueness  of  the  solution. 

4.2.1  Phenomenological  Model 

Exact  stochastic  models  of  task  allocation  can  quickly  become  analytically  in¬ 
tractable,  as  we  saw  above.  Instead  of  exact  models,  it  is  often  more  conve¬ 
nient  to  work  with  the  so-called  Rate  Equations  model.  These  equations  can 
be  derived  from  the  exact  stochastic  model  by  appropriately  averaging  it  [15]; 
however,  they  are  often  (see,  for  example,  population  dynamics  [6])  phenomeno¬ 
logical,  or  ad  hoc,  in  nature  —  constructed  by  taking  into  account  the  system’s 
salient  processes.  This  approach  makes  a  number  of  simplifying  assumptions: 
namely,  that  the  system  is  uniform  and  dilute  (not  too  dense),  that  actions  of 
individual  entities  are  independent  of  one  another,  that  parameters  can  be  rep¬ 
resented  by  their  mean  values  and  that  system  behavior  can  be  described  by  its 
average  value.  Despite  these  simplifications,  resulting  models  have  been  shown 
to  correctly  describe  dynamics  of  collective  behavior  of  robotic  systems  [18]. 
Phenomenological  models  are  useful  for  answering  many  important  questions 
about  the  performance  of  a  MRS,  such  as,  does  the  steady  state  exist,  how  long 
does  it  take  to  reach  it,  and  so  on.  Below  we  present  a  phenomenological  model 
of  dynamic  task  allocation. 

Individual  robots  are  making  their  decisions  to  change  task  state  probabilis¬ 
tically  and  independently  of  one  another.  A  robot  will  change  state  from  Green 
to  Red  with  probability  fg—>r  and  with  probability  1  —  fg—>r  it  will  remain  in 
the  Green  state.  We  can  succinctly  write  A Ng^r  and  A Nr_>g,  the  number  of 
robots  that  switch  from  Green  to  Red  and  vice  versa  during  a  sufficiently  small 
time  interval  At,  as 

Ng 

A Ng^r  =  Ya ',i(fg->r6{Xj  -  1)  +  (1  -  fg->r)S( Xj)) 
i= 1 
Nr 

A  Nr_g  =  Yyi- Xi)(fr^gS(Xj)  +  (1  -  fr^g)S(Xj  -  1))  ■ 

2—1 

Here  we  introduced  a  state  variable  a y,  such  that  ay  =  1  when  a  robot  is  in  the 
Green  state,  and  ay  =  0  when  a  robot  is  in  the  Red  state.  5(x)  is  Kronecker 
delta,  defined  as  6(x)  =  1  when  x  =  0  and  S(x)  =  0  otherwise.  Therefore, 
A Ng_,r  is  a  random  variable  from  a  binomial  distribution  specified  by  a  mean 
p  =  fg^rNg  and  variance  a2  =  fg^r(l  —  fg^r)Ng.  Similarly,  the  distribution 
of  the  random  variable  A Nr^g  is  specified  by  mean  p  =  fr^gNr  and  variance 
a2  =  f f  — ,g ( 1  -  fr^g)Nr. 

During  a  time  interval  At  the  total  number  of  robots  in  Red  and  Green  task 
states  will  change  as  individual  robots  make  decisions  to  change  states.  The 
following  finite  difference  equation  specifies  how  the  number  of  Red  will  change 


on  average: 


Nr(t  +  At)  =  Nr(t)  +  eANg^rAt  -  eANr^gAt  (28) 

Rearranging  the  equation  and  taking  the  continuous  time  limit  (At  — >  0)  yields 
a  differential  Rate  Equation  that  describes  time  evolution  of  the  number  of  Red 
robots.  By  taking  the  means  of  AiV’s  as  their  values,  we  recover  Equation  2. 

Keeping  AN’s  as  random  variables  allows  us  to  study  the  effect  the  prob¬ 
abilistic  nature  of  the  robots’  decisions  have  on  the  collective  behavior.2  We 
solve  Equation  28  by  iterating  it  in  time  and  drawing  AN's  at  random  from 
their  respective  distributions.  The  solutions  are  subject  to  the  initial  condition 
Nr(t  <  0)  =  N  and  specify  the  dynamics  of  task  allocation  in  robots. 

Functions  fg^r  and  fr^g  are  calculated  using  estimates  of  the  densities  of 
Red  tasks  (mr)  and  robots  in  Red  state  (nr)  from  the  observed  counts  stored 
in  the  robot’s  history  window. 

Transition  rates  fg—>r  and  fr^g  in  the  model  are  mean  values,  averaged  over 
all  histories  and  all  robots.  In  order  to  compute  them,  we  need  to  aggregate 
observations  of  all  robots.  Suppose  each  robot  has  a  history  window  of  length 
h.  For  a  particular  robot  i,  the  values  in  the  most  recent  observational  slot  are 
Nf  ,  Nf  ,  Mf  r  and  M?  ,  the  observed  numbers  of  Red  and  Green  robots  and 
tasks  respectively  at  time  t.  In  the  next  latest  slot,  the  values  are  N}r,  N}  , 
M\r  and  M)1  g ,  the  observed  numbers  at  time  t  —  A,  and  so  on.  Each  robot 
estimates  the  densities  of  Red  robots  and  tasks  using  the  following  calculation: 
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When  observations  of  all  robots  are  taken  into  account,  the  mean  of  the 
observed  densities  of  Red  robots  at  time  t  —  ^  J2iLi  ni,r  —  wid  fluctuate  due 
to  observation  noise,  but  on  average  it  will  be  proportional  to  Nr(t)/N,  which 
is  the  actual  density  of  Red  robots  at  time  t.  The  proportionality  factor  is 
related  to  physical  robot  parameters,  such  as  speed  and  observation  area  (see 
Section  6.1).  Likewise,  the  average  of  the  observed  densities  at  time  t  —  j A  is 
YliLi  nir  ^  Nr(t  —  jA)/N,  the  density  of  robots  at  time  t  —  j  A.  Thus,  the 
aggregate  estimates  of  the  fractions  of  Red  robots  and  tasks  are: 


1  JV  j  h-1 

flr  =  =  —  ^Ar(f- jA)  (31) 

i= 1  3=0 

N  ^  h-1 

mr  =  —  ^miir  =  —  jA)  (32) 

i= 1  3=0 

2 Note  that  we  do  not  model  here  the  effect  of  observation  noise  due  to  uncertainty  in  sensor 
readings  and  fluctuations  in  the  distribution  of  tasks. 


Robots  are  making  their  decisions  asynchronously,  i.  e.,  at  slightly  different 
times.  Therefore,  the  last  terms  in  the  above  equations  are  best  expressed  in 
continuous  form:  e.g.,  1/Nh /°  Nr(t  —  r)dr  (see  Equation  25  and  Equation  26). 

Estimates  Equation  31  and  32  can  be  plugged  into  Equation  23  and  Equa¬ 
tion  24  to  compute  the  values  of  transition  probabilities  for  any  choice  of  the 
transition  function  (power  or  linear).  Once  we  know  /r_>g  and  /g_>r,  we  can 
solve  Equation  28  to  study  the  dynamics  of  task  allocation  in  robots.  Note  that 
Equation  28  is  now  a  time-delay  finite  difference  equation,  and  solutions  will 
show  typical  oscillations. 

We  solve  the  models  presented  in  this  section  and  validate  their  predictions 
in  context  of  the  multi-foraging  task  described  next. 

5  Multi-Robot  Multi-Foraging  Task 

In  this  section  we  describe  the  multi-foraging  task  domain  in  which  we  exper¬ 
imentally  tested  our  dynamic  task  allocation  mechanism,  including  the  simu¬ 
lation  environment  used  and  robot  sensing  and  control  characteristics.  In  Sec¬ 
tion  6.1  we  use  this  application  to  validate  the  models  presented  above,  solve 
them  and  compare  their  solutions  to  the  results  of  embodied  simulations. 

5.1  Task  Description 

The  traditional  foraging  task  is  defined  by  having  an  individual  robot  or  group 
of  robots  collect  a  set  of  objects  from  an  environment  and  either  consume  on 
the  spot  or  return  them  to  a  common  location  [5].  Multi-foraging,  a  variation 
on  traditional  foraging,  is  defined  in  [2]  and  consists  of  an  arena  populated  by 
multiple  types  of  objects  to  be  concurrently  collected. 

In  our  multi-foraging  domain,  there  are  two  types  of  objects  (e.g.,  pucks) 
randomly  dispersed  throughout  the  arena:  Puck/je(j  and  PuckQreer!  pucks  that 
are  distinguishable  by  their  color.  Each  robot  is  equally  capable  of  foraging 
both  puck  types,  but  can  only  be  allocated  to  foraging  for  one  type  at  any 
given  time.  Additionally,  all  robots  are  engaged  in  foraging  at  all  times;  a  robot 
cannot  be  idle.  A  robot  may  switch  the  puck  type  for  which  it  is  foraging 
according  to  its  control  policy,  when  it  determines  it  is  appropriate  to  do  so. 
This  is  an  instantiation  of  the  general  task  allocation  problem  described  earlier 
in  this  paper,  with  puck  colors  representing  different  task  types. 

In  the  multi-foraging  task,  the  robots  move  in  an  enclosed  arena  and  pick  up 
encountered  pucks.  When  a  robot  picks  up  a  puck,  the  puck  is  consumed  (i.e., 
it  is  immediately  removed  from  the  environment,  not  transported  to  another 
region)  and  the  robot  carries  on  foraging  for  other  pucks.  Immediately  after 
a  puck  is  consumed,  another  puck  of  the  same  type  is  placed  in  the  arena  at 
a  random  location.  This  is  done  so  as  to  maintain  a  constant  puck  density 
in  the  arena  throughout  the  course  of  an  experiment.  In  some  situations,  the 
density  of  pucks  can  impact  the  accuracy  or  speed  of  convergence  to  the  desired 
task  allocation.  This  is  an  important  consideration  in  dynamic  task  allocation 


mechanisms  for  many  domains;  however,  in  this  work  we  want  to  limit  the 
number  of  experimental  variables  impacting  system  performance.  Therefore, 
we  reserve  the  investigation  on  the  impact  of  varying  puck  densities  for  future 
work. 

The  role  of  dynamic  task  allocation  in  this  domain  requires  the  robots  to 
split  their  numbers  by  having  some  forage  for  Puck#e£2  pucks  and  others  for 
Puck  Green  pucks.  For  the  purpose  of  our  experiments,  we  desire  an  allocation 
of  robots  to  converge  to  a  situation  in  which  the  proportion  of  robots  foraging 
for  Prickled  pucks  is  equal  to  the  proportion  of  Puck^^  pucks  present  in  the 
foraging  arena  (e.g.,  if  Puck#e(2  pucks  make  up  30%  of  the  pucks  present  in  the 
foraging  arena,  then  30%  of  the  robots  should  be  foraging  for  Puck^,^  pucks). 
In  general,  the  desired  allocation  could  take  other  forms.  For  example,  it  could 
be  related  to  the  relative  reward  or  cost  of  foraging  each  puck  type  without 
change  to  our  approach. 

We  note  that  the  limited  sensing  capabilities  and  lack  of  direct  communica¬ 
tion  of  the  individual  robots  in  the  implementation  of  our  task  domain  prohibits 
them  from  acquiring  global  information  such  as  the  size  and  shape  of  the  for¬ 
aging  arena,  the  initial  or  current  number  of  pucks  to  be  foraged  (total  or  by 
type),  or  the  initial  or  current  number  of  foraging  robots  (total  or  by  foraging 
type). 

5.2  Simulation  Environment 

In  order  to  experimentally  demonstrate  the  dynamic  task  allocation  mecha¬ 
nism  we  made  use  of  a  physically-realistic  simulation  environment.  Our  simu¬ 
lation  trials  were  performed  using  Player  and  Gazebo  simulation  environments. 
Player  [3]  is  a  server  that  connects  robots,  sensors,  and  control  programs  over 
a  network.  Gazebo  [12]  simulates  a  set  of  Player  devices  in  a  3-D  physically- 
realistic  world  with  full  dynamics.  Together,  the  two  represent  a  high-fidelity 
simulation  tool  for  individual  robots  and  teams  that  has  been  validated  on  a 
collection  of  real-robot  robot  experiments  using  Player  control  programs  trans¬ 
ferred  directly  to  physical  mobile  robots.  Figure  1  provides  snapshots  of  the 
simulation  environment  used.  All  experiments  involved  20  robots  foraging  in  a 
400m2  arena. 

The  robots  used  in  the  experimental  simulations  are  realistic  models  of  the 
ActivMedia  Pioneer  2DX  mobile  robot.  Each  robot,  approximately  30  cm  in 
diameter,  is  equipped  with  a  differential  drive,  an  odometry  system  using  wheel 
rotation  encoders,  and  180  degree  forward-facing  laser  rangefinder  used  for  ob¬ 
stacle  avoidance  and  as  a  fiducial  detector/reader.  Each  puck  is  marked  with 
a  fiducial  that  marks  the  puck  type  and  each  robot  is  equipped  with  a  fiducial 
that  marks  the  active  foraging  state  of  the  robot.  Note  that  the  fiducials  do  not 
contain  unique  identities  of  the  pucks  or  robots  but  only  mark  the  type  of  the 
puck  or  the  puck  type  a  given  robot  is  engaged  in  foraging.  Each  robot  is  also 
equipped  with  a  2-DOF  gripper  on  the  front,  capable  of  picking  up  a  single  8 
cm  diameter  puck  at  a  time.  There  is  no  capability  available  for  explicit,  di¬ 
rect  communication  between  robots  nor  can  pucks  and  other  robots  be  uniquely 


Figure  1:  Snapshots  from  the  simulation  environment  used,  (left)  An  overhead 
view  of  foraging  arena  and  robots,  (right)  A  closeup  of  robots  and  pucks. 


identified. 

5.3  Behavior-Based  Robot  Controller 

All  robots  have  identical  behavior-based  controllers  consisting  of  the  following 
mutually  exclusive  behaviors:  Avoiding,  Wandering,  Puck  Servoing,  Grasping, 
and  Observing.  Descriptions  of  robot  behaviors  are  provided  below. 

-  The  Avoiding  behavior  causes  the  robot  to  turn  to  avoid  obstacles  in  its 
path. 

-  The  Wandering  behavior  causes  the  robot  to  move  forward  and,  after  a 
random  length  of  elapsed  time,  to  turn  left  or  right  through  a  random  arc 
for  a  random  period  of  time. 

-  The  Puck  Servoing  behavior  causes  the  robot  to  move  toward  a  detected 
puck  of  the  desired  type.  If  the  robot’s  current  foraging  state  is  Robot Red, 
the  desired  puck  type  is  Puck#ed,  and  if  the  robots  current  foraging  state 
is  Robotoeern  the  desired  puck  type  is  Pucko.een. 

-  The  Grasping  behavior  causes  the  robot  to  use  its  gripper  to  pick  up  and 
consume  a  puck  within  the  gripper’s  grasp. 

-  The  Observing  behavior  causes  the  robot  to  take  the  current  fiducial 
information  returned  by  the  laser  rangefinder  and  record  the  detected 
pucks  and  robots  to  their  respective  histories.  The  robot  then  updates 
its  foraging  state  based  on  those  histories.  A  description  of  the  histories 
is  given  in  Section  5.3.1  and  a  description  of  the  foraging  state  update 
procedure  is  given  in  Section  5.3.2. 

Each  behavior  listed  above  has  a  set  of  activation  conditions  based  on  rel¬ 
evant  sensor  inputs  and  state  values.  When  met,  the  conditions  cause  the  be¬ 
havior  to  be  become  active.  A  description  of  when  each  activation  condition  is 
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Table  1:  Behavior  Activation  Conditions.  Behaviors  are  listed  in  order  of  de¬ 
creasing  rank.  Higher  ranking  behaviors  preempt  lower  ranking  behaviors  in 
the  event  multiple  are  active.  X  denotes  the  activation  condition  is  irrelevant 
for  the  behavior. 


active  is  given  below.  The  activation  conditions  of  all  behaviors  are  shown  in 
Table  1. 

-  The  Obstacle  Detected  activation  condition  is  true  when  an  obstacle 
is  detected  by  the  laser  rangefinder  within  a  distance  of  1  meter.  Other 
robots,  pucks,  and  the  arena  walls  are  considered  obstacles. 

-  The  Puck/)et  Detected  activation  condition  is  true  if  the  robot’s  current 
foraging  state  is  Robot r>et  and  a  puck  of  type  Puck0et  (where  Det  is  Red 
or  Green)  is  detected  within  a  distance  of  5  meters  and  within  ±  30  degrees 
of  the  robot’s  direction  of  travel. 

-  The  Gripper  Break-Beam  On  activation  condition  is  true  if  the  break- 
beam  sensor  between  the  gripper  jaws  detects  an  object. 

-  The  Observation  Signal  activation  condition  is  true  if  the  distance  trav¬ 
eled  by  the  robot  according  to  odometry  since  the  last  time  the  Observing 
behavior  was  activated  is  greater  than  2  meters. 

5.3.1  Robot  State  Information 

All  robots  maintain  three  types  of  state  information:  foraging  state,  observed 
puck  history,  and  observed  robot  history.  The  foraging  state  identifies  the  type 
of  puck  the  robot  is  currently  involved  in  foraging.  A  robot  with  a  foraging 
state  of  Robotfied  refers  to  a  robot  engaged  in  foraging  Puck^e(j  pucks  and  a 
foraging  state  of  Robotoeen  refers  to  a  robot  engaged  in  foraging  Puck<3reen 
pucks.  For  simplicity,  we  will  refer  to  both  robot  foraging  states  and  puck  types 
as  Red  and  Green.  The  exact  meaning  will  be  clear  in  context. 

Each  robot  is  outfitted  with  a  colored  beacon  passively  observable  by  nearby 
robots  which  indicates  the  robot’s  current  foraging  state.  The  color  of  the  bea¬ 
con  changes  to  reflect  the  current  state  -  a  red  beacon  for  a  foraging  state  of 
Red  and  a  green  beacon  for  foraging  state  Green.  Thus,  the  colored  beacon 
acts  as  a  form  of  local,  passive  communication  conveying  the  robot’s  current 


foraging  state.  All  robots  maintain  a  limited,  constant-sized  history  storing  the 
most  recently  observed  puck  types  and  another  constant-sized  history  storing 
the  foraging  state  of  the  most  recently  observed  robots.  Neither  of  these  his¬ 
tories  contains  a  unique  identity  or  location  of  detected  pucks  or  robots,  nor 
does  it  store  a  time  stamp  of  when  any  given  observation  was  made.  The  his¬ 
tory  of  observed  pucks  is  limited  to  the  last  MAX-PUCK-HI  STORY  pucks  observed 
and  the  history  of  the  foraging  states  of  observed  robots  is  limited  to  the  last 
MAX-ROBOT-HISTORY  robots  observed. 

While  moving  about  the  arena,  each  robot  keeps  track  of  the  approximate 
distance  it  has  traveled  by  using  odometry  measurements.  At  every  interval  of 
2  meters  traveled,  the  robot  makes  an  observation  performed  by  the  Observing 
behavior.  This  procedure  is  nearly  instantaneous;  therefore,  the  robot’s  behavior 
is  not  outwardly  affected.  The  area  in  which  pucks  and  other  robots  are  visible  is 
within  5  meters  and  ±  30  degrees  in  the  robot’s  direction  of  travel.  Observations 
are  only  made  after  traveling  2  meters  because  updating  too  frequently  leads 
to  over-convergence  of  the  estimated  puck  and  robot  type  proportions  due  to 
repeated  observations  of  the  same  pucks  and/or  robots.  On  average,  during  our 
experiments,  a  robot  detected  2  pucks  and  robots  per  observation. 

5.3.2  Foraging  State  Transition  Function 

After  a  robot  makes  an  observation,  it  re-evaluates  and  probabilistically  changes 
its  current  foraging  state  given  the  newly  updated  puck  and  robot  histories.  The 
probability  by  which  the  robot  changes  its  foraging  state  is  defined  by  the  tran¬ 
sition  function.  We  experimentally  studied  transition  functions  given  by  Equa¬ 
tion  3,  Equation  23  and  Equation  24  with  both  power  and  linear  forms.  Below 
we  present  results  of  analysis  and  simulations  and  discuss  the  consequences  the 
choice  of  the  transition  function  has  on  system  level  behavior. 

6  Analysis  and  Simulations  Results 

The  mathematical  models  developed  in  Section  4  can  be  directly  applied  to  the 
multi-foraging  task  if  we  map  Red  and  Green  tasks  to  Red  and  Green  pucks 
and  task  states  of  robots  to  their  foraging  states.  Model  parameters,  such  as 
e,  a,  etc,  depend  on  physical  realizations  of  the  implementation  and  can  be 
computed  from  details  of  the  multi-foraging  task  as  described  below. 

6.1  Observations  of  Pucks  Only 

First,  we  study  the  model  of  dynamic  task  allocation,  presented  in  Section  4.1, 
where  robots  observe  only  pucks  and  make  decision  to  switch  foraging  state  ac¬ 
cording  to  the  transition  functions  given  by  Equation  3.  We  compared  theoret¬ 
ical  predictions  of  the  robots’  collective  behavior  with  results  from  simulations. 
We  used  Equation  14  and  16  to  compute  how  the  average  number  of  robots  in 
the  Red  state  changes  in  time  when  the  puck  distribution  is  suddenly  changed. 


The  parameter  values  were  obtained  from  experiments,  pg  =  1.0  was  the  initial 
density  of  Red  robots  (of  20  total  robots),  /j,q  =  0.3  was  the  initial  Red  puck 
density  (of  50  total  pucks),  which  remained  constant  until  it  was  changed  by 
the  experimenter.  The  first  change  in  puck  density  was  A fi  =  0.5,  meaning  that 
80%  of  the  pucks  in  the  arena  are  now  Red.  The  second  change  in  puck  density 
was  A p  =  —0.3,  to  50%  Red  pucks. 

e  is  the  rate  at  which  robots  make  decisions  to  switch  states.  Robot  traveled 
2  to  between  observations  at  an  average  speed  of  0.2  m/s;  therefore,  there  are 
10  s  between  observations,  and  e  =  0.1.  h ,  the  history  length,  is  the  number 
of  pucks  in  the  robot’s  memory.  aM°  is  the  rate  at  which  robots  encounter 
pucks.  A  robot  makes  an  observation  of  its  local  environment  at  discrete  time 
intervals.  The  area  visible  to  the  robot  is  AViS  =  (5  to)2 7t/6  =  13.09,  with  1/6 
coming  from  the  60°  angle  of  view.  The  arena  area  is  A  =  315  to2;  therefore, 
aM°  =  AvisM°/A  =  2.1.  We  studied  the  dynamics  of  the  system  for  different 
history  lengths  h. 
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Figure  2:  Evolution  of  the  fraction  of  Red  robots  for  different  history  lengths. 
Robots’  decision  to  change  state  is  based  on  observations  of  pucks  only. 

Figure  2  shows  evolution  of  the  numbers  of  Red  robots  for  different  history 
lengths.  Initially,  the  distribution  of  Red  pucks  is  set  to  30%  and  all  the  robots 
are  in  the  Red  foraging  state.  At  t  =  500  s,  the  puck  distribution  changes 


abruptly  to  80%,  and  at  t  =  1000  s  to  50%.  The  solid  line  shows  results  of 
simulations  —  the  fraction  of  Red  robots,  averaged  over  10  runs.  The  dashed 
line  gives  theoretical  predictions  for  the  parameters  quoted  above.  Since  we 
are  in  the  e/i>  1  limit  (for  h  =  50, 100),  the  time  it  takes  to  converge  to  the 
steady  state  is  linear  in  history  length,  tconv  ~  h,  as  predicted  by  Equation  16. 
The  agreement  between  theoretical  and  experimental  results  is  excellent.  We 
stress  that  there  are  no  free  parameters  in  the  theoretical  predictions  —  only 
experimental  values  of  the  parameters  were  used  in  producing  these  plots. 

In  addition  to  being  able  to  predict  the  average  collective  behavior  of  the 
multi-robot  system,  we  can  also  quantitatively  characterize  the  amount  of  fluc¬ 
tuations  in  the  system.  Fluctuations  are  deviations  from  the  steady  state  (after 
the  system  has  converged  to  the  steady  state)  that  arise  from  the  stochastic 
nature  of  robot’s  observations  and  decisions.  These  deviations  result  in  fluctu¬ 
ations  from  the  desired  global  distribution  of  Red  and  Green  robots  seen  in  an 
individual  experiment.  One  can  suppress  these  fluctuations  by  averaging  results 
of  many  identical  experiments. 


f>0=30% 

I —  l0=50% 


Figure  3:  Histogram  of  the  fraction  of  Red  robots  in  the  steady  state  for  three 
different  puck  distributions  (data  for  h  =  10).  fio  specifies  fraction  of  Red  pucks. 
Lines  are  theoretical  predictions  of  the  distribution  of  Red  robots. 

To  measure  the  strength  of  the  fluctuations,  we  take  data  from  an  individual 
experimental  run  and  extract  the  fraction  of  Red  robots,  after  the  system  has 
converged  to  the  steady  state,  for  each  of  the  three  Red  puck  distributions: 
/ro  =  30%,  50%,  80%.  Because  the  runs  were  relatively  short,  we  only  have 
300  s  worth  of  data  (30  data  points)  in  the  converged  state;  however,  since  each 
experiment  was  repeated  ten  times,  we  make  the  data  sets  longer  by  appending 
data  from  all  experiments.  In  the  end,  we  have  300  measurements  of  the  steady 
state  Red  robot  density  for  three  different  puck  distributions.  Figure  6.1  shows 
the  histogram  of  robot  distributions  for  three  different  puck  distributions.  The 
solid  lines  are  computed  using  Equation  22,  where  for  7  we  used  the  actual  means 
of  the  steady  state  distributions  (7  =  0.28,  0.47  and  0.7  for  fi0  =  30%,  50%  and 
80%  respectively) .  We  can  see  from  the  plots  that  the  theory  correctly  predicts 


the  strength  of  fluctuations  about  the  steady  state.  As  is  true  of  binomial 
distributions,  the  fluctuations  (measured  by  the  variance)  are  greatest  for  cases 
where  the  numbers  of  Red  and  Green  pucks  are  comparable  (/x0  =  50%)  and 
smaller  when  their  numbers  are  very  different  (/io  =  80%). 

6.2  Observations  of  Pucks  and  Robots 

In  this  section  we  study  the  dynamic  task  allocation  model  developed  in  Sec¬ 
tion  4.2,  in  which  robots  use  observations  of  pucks  and  other  robots’  foraging 
states  to  make  decision  to  change  their  own  foraging  state. 

Figure  4  shows  results  of  embodied  simulations  (solid  lines)  as  well  as  so¬ 
lutions  to  the  model  Equation  28  (dashed  lines)  for  different  values  of  robot 
history  length  and  forms  of  transition  function  (given  by  Equation  23  and  24, 
with  g(z)  linear  or  power  function).  Initially,  the  Red  puck  fraction  (dotted  line) 
is  30%.  It  is  changed  abruptly  at  t  =  500  s  to  80%  and  then  again  at  t  =  2000  s 
to  50%.  Each  solid  line  showing  Red  robot  density  has  been  averaged  over  10 
runs.  We  rescale  the  dimensionless  time  of  the  model  by  parameter  10,  corre¬ 
sponding  e  =  0.1.  The  history  length  was  the  only  adjustable  parameter  used  in 
solving  the  equations.  The  values  of  h  used  to  compute  the  observed  fraction  of 
Red  robots  nr  in  Equation  31  were  h  =  2,  8,  16,  corresponding  to  experimental 
history  lengths  10,  50,  100  respectively.  For  mr,  the  observed  fraction  of  Red 
pucks,  we  used  their  actual  densities. 

In  order  to  explain  the  difference  in  history  lengths  between  theory  and  ex¬ 
periment,  we  note  that  in  the  simulation  experiments,  the  history  length  means 
the  numbers  of  observed  robots  and  pucks,  while  in  the  model,  it  means  the 
number  of  observations,  with  multiple  objects  sighted  within  a  single  observa¬ 
tion.  According  to  calculations  in  Section  6.1,  a  robot  observes  about  2  pucks 
in  a  single  observation.  Moreover,  the  robot  travels  2  to  between  observations, 
yet  it  sees  5  to  out  during  each  observation,  meaning  that  individual  observa¬ 
tions  will  be  correlated.  Observations  will  be  further  correlated  because  of  the 
pattern  of  a  robot’s  motion  —  as  the  robot  moves  in  a  straight  line  towards 
a  goal,  it  is  likely  to  observe  overlapping  regions  of  the  arena.  These  consid¬ 
erations  could  explain  the  factor  of  five  difference  between  the  history  lengths 
used  in  the  experiments  and  the  corresponding  values  used  in  the  model.  More 
detailed  experiments,  for  example,  ones  in  which  robots  travel  farther  between 
observations,  are  necessary  to  explain  these  differences. 

Solutions  exhibit  oscillations,  although  eventually  oscillations  decay  and  so¬ 
lutions  relax  to  their  steady  state  values.  In  all  cases,  the  steady  state  value  is 
the  same  as  the  fraction  of  red  pucks  in  the  arena.  History-induced  oscillations 
are  far  more  pronounced  for  the  linear  transition  function  (Figure  4(a))  than  for 
the  power  transition  function  (Figure  4(b)).  For  the  power  transition  function, 
these  oscillations  are  present  but  become  evident  only  for  longer  history  lengths. 
This  behavior  is  probably  caused  by  the  differences  between  the  values  of  tran¬ 
sition  functions  near  the  steady  state:  while  the  value  of  the  power  transition 
function  remains  small  near  the  steady  state,  the  value  of  the  linear  transition 
function  grows  linearly  with  the  distance  from  the  steady  state,  thereby  ampli- 
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Figure  4:  Evolution  of  the  fraction  of  Red  robots  for  different  history  lengths 
and  transition  functions,  compared  to  predictions  of  the  model 


fying  any  deviations  from  the  steady  state  solution.  The  amplitude  and  period 
of  oscillations  and  the  convergence  rate  of  solutions  to  the  steady  state  all  de¬ 
pend  on  history  length,  and  it  generally  takes  longer  to  reach  the  steady  state 
for  longer  histories.  Another  conclusion  is  that  the  linear  transition  function 
converges  to  the  desired  distribution  faster  than  the  power  function,  at  least  for 
moderate  history  lengths. 

7  Discussion 

We  have  constructed  and  analyzed  mathematical  models  of  dynamic  task  alloca¬ 
tion  in  a  multi-robot  system.  The  models  are  general  and  can  be  easily  extended 
to  other  systems  in  which  robots  use  a  history  of  local  observations  of  the  envi¬ 
ronment  as  a  basis  for  making  decisions  about  future  actions.  These  models  are 
based  on  theory  of  stochastic  processes.  In  order  to  study  a  robot’s  behavior, 
we  do  not  need  to  know  its  exact  trajectory  or  the  trajectories  of  other  robots; 
instead,  we  derive  a  probabilistic  model  that  governs  how  a  robot’s  behavior 
changes  in  time.  In  some  simple  cases  these  models  can  be  solved  analytically. 
However,  stochastic  models  are  usually  too  complex  for  exact  analytic  treat¬ 
ment.  Thus,  in  the  scenario  described  in  Section  4.1  in  which  only  observations 
of  tasks  are  made,  though  the  individual  model  is  tractable,  the  stochastic  model 
of  the  collective  behavior  is  not.  Instead,  we  use  averaging  and  approximation 
techniques  to  quantitatively  study  the  dynamics  of  the  collective  behavior.  Such 
models,  therefore,  do  not  describe  the  robots’  behavior  in  a  single  experiment, 
but  rather  the  behavior  that  has  been  averaged  over  many  experimental  or  sim¬ 
ulations  runs.  Fortunately,  results  of  experiments  and  simulations  are  usually 
presented  as  an  average  over  many  runs;  therefore,  mathematical  models  of  aver¬ 
age  collective  behavior  can  be  used  to  describe  experimental  results.  In  fact,  the 
stochastic  model  produces  excellent  agreement  with  experimental  results  under 
all  experimental  conditions  and  without  using  any  adjustable  parameters. 

Phenomenological  models  are  more  straightforward  to  construct  and  ana¬ 
lyze  than  exact  stochastic  models  —  in  fact,  they  can  be  easily  constructed 
from  details  of  the  individual  robot  controller  [18].  The  ease  of  use  comes  at  a 
price,  namely,  the  number  of  simplifying  assumptions  that  were  made  in  order 
to  produce  a  mathematically  tractable  model.  First,  we  assume  that  the  robots 
are  functioning  in  a  dilute  limit,  where  they  are  sufficiently  separated  that  their 
actions  are  largely  independent  of  one  another.  Second,  we  assume  that  the  tran¬ 
sition  rates  can  be  represented  by  aggregate  quantities  that  are  spatially  uniform 
and  independent  of  the  details  of  the  individual  robot’s  actions  or  history.  We 
also  assume  the  system  is  homogeneous,  with  modeled  robots  characterized  by  a 
set  of  parameters,  each  of  them  representing  the  mean  value  of  some  real  robot 
feature:  mean  speed,  mean  duration  for  performing  a  certain  maneuver,  and  so 
on.  Real  robot  systems  are  heterogeneous:  even  if  the  robots  are  executing  the 
same  controller,  there  will  always  be  variations  due  to  inherent  differences  in 
hardware.  We  do  not  consider  parameter  distributions  in  our  models  as  would 
be  necessary  to  describe  such  heterogeneous  systems.  Finally,  phenomenological 


models  more  reliably  describe  systems  where  fluctuations  (deviations  from  the 
mean  behavior)  can  be  neglected,  as  happens  in  large  systems  or  when  many 
experimental  runs  are  aggregated.  However,  even  if  phenomenological  models 
don’t  agree  with  experiments  exactly,  as  we  saw  in  Section  6.2,  they  can  still 
reliably  predict  most  behaviors  of  interest  even  in  not-so-large  systems.  They 
are,  therefore,  a  useful  tool  for  modeling  and  analyzing  multi-robot  systems. 


8  Conclusion 

Mathematical  analysis  can  be  a  useful  tool  for  the  study  and  design  of  MRS 
and  a  viable  alternative  to  experiments  and  simulations.  It  can  be  applied  to 
large  systems  that  are  too  costly  to  build  or  take  too  long  to  run  in  simula¬ 
tion.  Mathematical  analysis  can  be  used  to  study  the  behavior  of  an  MRS, 
select  parameters  that  optimize  its  performance,  prevent  instabilities,  etc.  In 
conjunction  with  the  design  process,  mathematical  analysis  can  help  under¬ 
stand  the  effect  individual  robot  characteristics  have  on  the  collective  behavior 
before  a  system  is  implemented  in  hardware  or  in  simulation.  Unlike  experi¬ 
ments  and  simulations,  where  exhaustive  search  of  the  design  parameter  space 
is  often  required  to  reach  any  conclusion,  analysis  can  often  produce  exact  an¬ 
alytic  results,  or  scaling  relationships,  for  the  quantities  of  interest.  If  these 
are  not  possible,  exhaustive  search  of  the  parameter  space  is  much  more  prac¬ 
tical  and  efficient.  Finally,  results  of  analysis  can  be  used  as  feedback  to  guide 
performance-enhancing  modifications  of  the  robot  controller. 

In  this  paper  we  have  described  an  dynamic  task  allocation  mechanism 
where  robots  use  local  observations  of  the  environment  to  decide  their  task 
assignments.  We  have  presented  a  mathematical  model  of  this  task  allocation 
mechanism  and  studied  it  in  the  context  of  a  multi-foraging  task  scenario.  We 
compared  predictions  of  the  model  with  results  of  embodied  simulations  and 
found  excellent  quantitative  agreement.  In  this  application,  mathematical  anal¬ 
ysis  could  help  the  designer  choose  robot  properties,  such  as  the  form  of  the 
transition  probability  used  by  robots  to  switch  their  task  state,  or  decide  how 
many  observations  the  robot  ought  to  consider. 

Mathematical  analysis  of  MRS  is  a  new  field,  but  its  success  in  explaining 
experimental  results  shows  it  to  be  a  promising  tool  for  the  design  and  analysis 
of  robotic  systems.  The  field  is  open  to  new  research  directions,  from  apply¬ 
ing  analysis  to  new  robotic  systems  to  developing  increasingly  sophisticated 
mathematical  models  that,  for  example,  account  for  heterogeneities  in  robot 
population  that  are  due  to  differences  in  their  sensors  and  actuators. 
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